Abstract

In this work, goodness-of-fit tests are adapted and applied to CMB maps to detect
possible non-Gaussianity. We use the Shapiro-Francia test and two smooth
goodness-of-fit tests: one developed by Rayner and Best and another developed by
Thomas and Pierce. The smooth tests probe small and smooth deviations from a
prefixed probability function (in our case the univariate Gaussian). In
addition, the Rayner and Best test indicates the kind of non-Gaussianity
present: excess of skewness, of kurtosis, and so on. These tests are optimal
when the data are independent. We simulate and analyse non-Gaussian signals in
order to study the power of these tests. The non-Gaussian simulations are
constructed using the Edgeworth expansion, assuming pixel-to-pixel
independence. As an application, we test the Gaussianity of the MAXIMA data.
Results indicate that the MAXIMA data are compatible with Gaussianity. Finally,
the skewness and kurtosis of the MAXIMA data are constrained by $|S| < 0.035$
and $|K| < 0.036$ at the 99% confidence level.
1 Introduction



Standard inflationary theories establish that primordial density fluctuations
in the Universe had a Gaussian distribution. These fluctuations grew under
gravity and created the structures we see today in the Universe. They also left
their imprint in the cosmic microwave background (CMB) as primordial
anisotropies, which should also follow a Gaussian distribution. Therefore, by
testing the Gaussianity of the CMB we test the Gaussianity of the primordial
fluctuations and the validity of standard inflation. Gaussianity can be studied
in several spaces: real space, Fourier space and wavelet space. In this paper
we work in real space and apply three goodness-of-fit tests. We study their
power to distinguish between Gaussian and non-Gaussian maps. As an application,
we test the Gaussianity of the MAXIMA data. Cayon et al. (2003) find these data
compatible with Gaussianity. We refer to that paper for most of the
goodness-of-fit application to the MAXIMA data. In the present paper we give
constraints on the skewness and kurtosis of these data.

The organization of the paper is as follows. Goodness-of-fit tests are presented
and tested in Section 2. Section 3 is devoted to analysing the MAXIMA data
and to constraining their skewness and kurtosis values. Finally, Section 4 is
devoted to discussion and conclusions.



2 Goodness-of-fit statistics



Given a sample of uncorrelated and normalized CMB data, we want to answer
the question: "how well do the data agree with the population of a Gaussian
distribution $N(0, 1)$?".



2.1 Shapiro-Francia test



There are many goodness-of-fit methods to test Gaussianity (for a review see
D'Agostino and Stephens, 1986). The Shapiro and Francia test is one of these
methods (Shapiro and Francia, 1972). The statistic associated with this test
measures the correlation between a Gaussian distribution and our experimental
data. We estimate a one-dimensional array $c$ corresponding to the expected
sorted values obtained from independent Gaussian simulations $N(0, 1)$. Then
we define $h = c/\|c\|$, where $\|c\|^2 = \sum_i c_i^2$. Given our sorted data
$x$, the Shapiro-Francia statistic SF is defined as follows:

$$ SF = \frac{\left( \sum_i h_i x_i \right)^2}{n \sigma^2}, $$

where $n$ is the number of data and $\sigma$ is its dispersion. Note that, if
$x = c \cdot \sigma$, as expected in the Gaussian case, then $SF \simeq 1$. Thus,
deviations from Gaussianity will result in values smaller than one because, in
this case, the correlation between $x$ and $c$ is smaller than in the Gaussian
case.
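The computation described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the paper: it assumes the statistic has the usual Shapiro-Francia form $SF = (\sum_i h_i x_i)^2/(n\sigma^2)$, it estimates $c$ by averaging sorted Gaussian samples, and the function names and the number of simulations are hypothetical choices. The sample mean is subtracted explicitly, which changes nothing for data already normalized to zero mean.

```python
import numpy as np

def expected_order_statistics(n, n_sims=200, seed=0):
    """Estimate c: the mean of sorted N(0, 1) samples of size n."""
    rng = np.random.default_rng(seed)
    sims = np.sort(rng.standard_normal((n_sims, n)), axis=1)
    return sims.mean(axis=0)

def shapiro_francia(x, c):
    """SF = (h . x_sorted)^2 / (n * sigma^2), with h = c / ||c||."""
    x = np.sort(np.asarray(x, dtype=float))
    h = c / np.linalg.norm(c)
    n = x.size
    sigma2 = x.var()                      # squared dispersion (1/n form)
    # Subtracting the mean is an illustrative safeguard for unnormalized input.
    return float(np.dot(h, x - x.mean()) ** 2 / (n * sigma2))
```

With the mean removed, the Cauchy-Schwarz inequality guarantees a value no larger than one, so Gaussian samples give values close to one and skewed or heavy-tailed samples give smaller values.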



2.2 Smooth tests 



Smooth tests are constructed to discriminate between a predetermined function
$f(x)$ and a second one that deviates smoothly from the former. Given a
statistical variable $x$ and $n$ independent realizations
$(x_1, \ldots, x_n) = \mathbf{x}$, we want to test whether the probability
function of $x$ is equal to $f(x)$ (in our case $N(0, 1)$). We consider an
alternative probability density function $f(x, \theta)$ (where $\theta$ is a
parameter vector) that deviates smoothly from $f(x)$ and satisfies
$f(x, \theta_0) = f(x)$ (we consider $\theta_0 = 0$). In other words, we want to
test the null hypothesis $\theta = \theta_0$.

The smooth tests we are going to consider are based on the so-called score
statistic (these tests are explained at length in Cox and Hinkley, 1974). One
defines the natural logarithm of the likelihood as
$\ell(\mathbf{x}, \theta) = \sum_{i=1}^{n} \log f(x_i, \theta)$, the vector $U$ of
components $U_i(\theta) = \partial \ell(\mathbf{x}, \theta)/\partial \theta_i$ and
the matrix $I$ of components
$I_{ij}(\theta) = \langle U_i(\theta) U_j(\theta) \rangle = -\langle \partial^2 \ell(\mathbf{x}, \theta)/\partial \theta_i \partial \theta_j \rangle$.
The score statistic is closely related to the natural logarithm of the
likelihood ratio and is given by
$S = U^t(\theta_0) I^{-1}(\theta_0) U(\theta_0)$, where $U^t$ is the transpose of
$U$. The null hypothesis is rejected for large values of $S$.



2.2.1 Rayner-Best test 

These authors define an alternative probability density function of order $k$
(Rayner and Best), given by

$$ g_k(x) = C(\theta_1, \ldots, \theta_k) \exp\left\{ \sum_{i=1}^{k} \theta_i h_i(x) \right\} f(x). $$

The $h_i$ functions are orthonormal on $f$ and $h_0(x) = 1$. $C$ is a
normalization constant. Then, the score statistic associated with the
alternative of order $k$ is given by

$$ S_k = \sum_{i=1}^{k} U_i^2, \quad \text{with} \quad U_i = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} h_i(x_j). $$
When Gaussianity is tested, $h_n(x) = P_n(x)/s_n$, with $s_n = \sqrt{n!}$,
$P_0(x) = 1$, $P_1(x) = x$ and, for $n \geq 1$,
$P_{n+1}(x) = x P_n(x) - n P_{n-1}(x)$ (Hermite-Chebyshev polynomials). The
statistic $S_k$ is related to cumulants of order $\leq k$:
$S_1 = n \hat{\mu}_1^2$, $S_2 = S_1 + n(\hat{\mu}_2 - 1)^2/2$,
$S_3 = S_2 + n(\hat{\mu}_3 - 3\hat{\mu}_1)^2/6$,
$S_4 = S_3 + n([\hat{\mu}_4 - 3] - 6[\hat{\mu}_2 - 1])^2/24$ and
$S_5 = S_4 + n(\hat{\mu}_5 - 10\hat{\mu}_3 + 15\hat{\mu}_1)^2/120$, where
$\hat{\mu}_\alpha = \left( \sum_{j=1}^{n} x_j^\alpha \right)/n$.

This test is directional, that is, it indicates how the actual distribution
deviates from Gaussianity. For example, if $S_1$ and $S_2$ are small and $S_3$
is large, then the data have a large $\hat{\mu}_3$ value and therefore a large
skewness.

When $n \to \infty$, the $S_k$ statistic is distributed as a $\chi^2_k$. This
holds because, when $n \to \infty$, each $U_i$ is Gaussian distributed (it is a
sum of a large number of independent variables).
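As a minimal sketch of how the Rayner-Best statistics can be evaluated for the Gaussian case, the Python fragment below builds the orthonormal polynomials from the recurrence $P_{n+1}(x) = x P_n(x) - n P_{n-1}(x)$ and accumulates $S_k = \sum_{i \leq k} U_i^2$. The function names are hypothetical and the code is illustrative only.

```python
import math
import numpy as np

def hermite_orthonormal(x, k):
    """Return h_1..h_k evaluated at x, with h_n = P_n / sqrt(n!)."""
    x = np.asarray(x, dtype=float)
    p_prev, p_curr = np.ones_like(x), x.copy()      # P_0 and P_1
    out = []
    for n in range(1, k + 1):
        out.append(p_curr / math.sqrt(math.factorial(n)))
        # Recurrence P_{n+1}(x) = x P_n(x) - n P_{n-1}(x)
        p_prev, p_curr = p_curr, x * p_curr - n * p_prev
    return np.array(out)

def rayner_best(x, k):
    """Return the array [S_1, ..., S_k], where S_k = sum_{i<=k} U_i^2."""
    x = np.asarray(x, dtype=float)
    u = hermite_orthonormal(x, k).sum(axis=1) / math.sqrt(x.size)
    return np.cumsum(u ** 2)
```

Because the result is a cumulative sum, the increment from one order to the next isolates a single component, which is what makes the test directional: a jump between the second and third entries points to skewness, and a jump between the third and fourth to kurtosis.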



2.2.2 Thomas-Pierce test

Thomas and Pierce (1979) take the cumulative distribution function of a
$N(0, 1)$ variable $x$: $y(x) = \frac{1}{2}\left[ 1 + \mathrm{erf}(x/\sqrt{2}) \right]$,
where erf denotes the error function. If $x$ is Gaussian, then $y$ must be
uniformly distributed on the interval $[0, 1]$, and the alternative probability
function of $y$ is given by

$$ g_k(y) = \exp\left\{ \sum_{i=1}^{k} \theta_i y^i - K(\theta) \right\}, $$

where $K(\theta)$ is a normalization constant. The score statistic is then (with
the notation of Thomas and Pierce, 1979; $k = 1, 2, 3, \ldots$):

$$ W_k = \sum_{i=1}^{k} \left( \sum_{j=1}^{i} a_{ij} u_j \right)^2, \qquad u_j = \frac{1}{\sqrt{n}} \sum_{r=1}^{n} \left[ y^j(x_r) - \frac{1}{j+1} \right]. $$

The coefficients $a_{ij}$ are given in Table 3 of Thomas and Pierce (1979) (e.g.
$a_{11} = 16.3172$, $a_{21} = -a_{22} = -27.3809$).

As in the Rayner and Best test, when $n \to \infty$ the $W_k$ statistic is
distributed as a $\chi^2_k$. Note that $\langle y^j \rangle = (1 + j)^{-1}$, so
this method is directional if we work with the $y$ variable.
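The following Python sketch illustrates the construction for $k \leq 2$, using only the three coefficients quoted in the text. It assumes that $y$ is the standard normal cumulative distribution function, that $u_j = n^{-1/2} \sum_r [y_r^j - 1/(j+1)]$, and that the data have already been normalized to zero mean and unit variance. The function name and the matrix layout are hypothetical, and higher orders would need the remaining rows of Table 3 of Thomas and Pierce (1979), which are not reproduced here.

```python
import math
import numpy as np

# Coefficients quoted in the text from Table 3 of Thomas and Pierce (1979).
# Rows beyond i = 2 are not given here and must be taken from that table.
A = np.array([[16.3172, 0.0],
              [-27.3809, 27.3809]])

def thomas_pierce(x, a=A):
    """Return [W_1, ..., W_k] for k = a.shape[0], from normalized data x."""
    x = np.asarray(x, dtype=float)
    n, k = x.size, a.shape[0]
    # y = Phi(x): standard normal cumulative distribution function
    y = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in x])
    # u_j = n^{-1/2} sum_r [ y_r^j - 1/(j+1) ],  j = 1..k
    u = np.array([(y ** j - 1.0 / (j + 1)).sum() / math.sqrt(n)
                  for j in range(1, k + 1)])
    components = np.tril(a) @ u          # sum_{j<=i} a_ij u_j for each i
    return np.cumsum(components ** 2)
```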

2.3 Gaussian simulations 



In Cayon et al. (2003) the distributions of the previous statistics are
calculated. These distributions are obtained from 50000 independent Gaussian
simulations of maps with 2164 pixels (this number matches the number of central
pixels in the MAXIMA map, later selected for analysis). The pixels are mutually
independent. The data are normalized to zero mean and unit variance before the
tests are applied. Plots of these distributions can be found in Cayon et al.
(2003).
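A procedure of this kind can be sketched as follows. The fragment is illustrative only: the function names are hypothetical, the example statistic is $S_3$ written for normalized data, and the number of simulations is reduced from 50000 to 2000 to keep the example fast.

```python
import numpy as np

def null_distribution(statistic, n_pix=2164, n_sims=50000, seed=0):
    """Values of `statistic` over independent Gaussian maps of n_pix pixels."""
    rng = np.random.default_rng(seed)
    values = np.empty(n_sims)
    for s in range(n_sims):
        x = rng.standard_normal(n_pix)
        x = (x - x.mean()) / x.std()      # zero mean, unit variance
        values[s] = statistic(x)
    return values

def s3(x):
    """Example statistic: S_3 for normalized data, n * mu_3^2 / 6."""
    return x.size * np.mean(x ** 3) ** 2 / 6.0

null = null_distribution(s3, n_sims=2000)           # fewer runs, for speed
thr95, thr99 = np.quantile(null, [0.95, 0.99])      # rejection thresholds
```

The two quantiles play the role of the 95% and 99% confidence thresholds against which a measured statistic is compared.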



2.4 Non-Gaussian simulations. Edgeworth expansion. 



As an example of how well these methods discriminate between Gaussian and
non-Gaussian data, we analyse simulated non-Gaussian maps obtained through the
Edgeworth expansion (Martinez-Gonzalez, 2002). The Edgeworth expansion allows
one to construct a distribution with small deviations from Gaussianity and with
desired values of skewness, kurtosis or any other cumulants. Given the Gaussian
distribution $G(x)$ and denoting the cumulant of order $n$ by $k_n$, for small
values of these cumulants one can construct the density function $f(x)$:

$$ f(x) = G(x) \left\{ 1 + \sum_{n=3}^{\infty} \frac{k_n}{2^{n/2}\, n!} H_n\left( x/\sqrt{2} \right) + O(k_n k_{n'}) \right\}, $$

where $H_n$ is the Hermite polynomial of degree $n$. In particular the skewness
is $k_3$ and the kurtosis is $k_4$. If we set all cumulants to zero except one,
$f$ may be neither positive definite nor normalized. However, for small
deviations (small values of the cumulants), one can set the negative values of
$f$ to zero and then renormalize it, without disturbing the non-zero cumulants
appreciably.
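One possible way to draw independent pixel values from such a density is sketched below in Python. It is an assumption-laden illustration rather than the procedure used for the paper: the expansion is truncated at first order in $k_3$ and $k_4$, the identity $H_n(x/\sqrt{2}) = 2^{n/2} He_n(x)$ is used to write the polynomials explicitly, and sampling is done by inverting a CDF tabulated on a grid. The function names and the grid limits are hypothetical.

```python
import numpy as np

def edgeworth_pdf(x, k3=0.0, k4=0.0):
    """First-order Edgeworth density with cumulants k3, k4, clipped at zero."""
    he3 = x ** 3 - 3.0 * x                   # He_3(x) = 2^{-3/2} H_3(x / sqrt(2))
    he4 = x ** 4 - 6.0 * x ** 2 + 3.0        # He_4(x) = 2^{-2} H_4(x / sqrt(2))
    gauss = np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)
    f = gauss * (1.0 + k3 * he3 / 6.0 + k4 * he4 / 24.0)
    return np.clip(f, 0.0, None)             # set negative values to zero

def edgeworth_sample(n, k3=0.0, k4=0.0, seed=0, grid=None):
    """Draw n independent values by inverting the tabulated CDF."""
    x = np.linspace(-8.0, 8.0, 16001) if grid is None else grid
    f = edgeworth_pdf(x, k3, k4)
    cdf = np.cumsum(f)
    cdf /= cdf[-1]                           # renormalize after clipping
    rng = np.random.default_rng(seed)
    return np.interp(rng.uniform(size=n), cdf, x)
```

For example, `edgeworth_sample(2164, k3=0.1)` would give one simulated map of 2164 independent pixels with an injected skewness of 0.1.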

We can use the skewness ($S$) and kurtosis ($K$) as statistics. Given injected
values of $S$ and $K$ (input values) we construct simulations and obtain their
$S$ and $K$ distributions. We compare these distributions with the distributions
of $S$ and $K$ for Gaussian simulations. The power of these statistics is given
in Table 1. This table also shows that, as mentioned before, the cumulants do
not change significantly after setting the negative values of the probability
to zero and renormalizing it.

Using the same number of simulations with prefixed values of skewness and
kurtosis, we calculate the power of the tests presented in this paper. The
power of the $S_k$ statistics is shown in Table 2. The power of the $W_k$
and SF statistics is shown in Tables 3 and 4, respectively. We can see
that most of the presented tests have more power than the directly calculated
skewness and kurtosis. The $W_2$ statistic seems to be the best discriminator in
most cases.
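As an illustration of how such power values can be obtained from two sets of simulations, the Python sketch below counts the percentage of non-Gaussian simulations whose statistic falls beyond the chosen quantile of the Gaussian distribution. It assumes a one-sided rejection region (large values for $S_k$ and $W_k$, small values for SF); the function name and the `reject` argument are hypothetical.

```python
import numpy as np

def power(null_values, alt_values, level=0.95, reject="large"):
    """Percentage of alternative maps rejected at the given confidence level."""
    null_values = np.asarray(null_values, dtype=float)
    alt_values = np.asarray(alt_values, dtype=float)
    if reject == "large":       # e.g. S_k, W_k: large values reject Gaussianity
        threshold = np.quantile(null_values, level)
        rejected = alt_values > threshold
    else:                       # e.g. SF: small values reject Gaussianity
        threshold = np.quantile(null_values, 1.0 - level)
        rejected = alt_values < threshold
    return 100.0 * rejected.mean()
```

Applied to the Gaussian simulations themselves, the function returns roughly 5% at the 95% level, which is a convenient sanity check.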
Table 1

Average and dispersion of the skewness and kurtosis values obtained from 10000
simulations drawn from Edgeworth expansions with the input skewness and kurtosis
values denoted by S&K(in). The power P of these two statistics is also given in
columns 4 and 5.

S&K(in)   Mean/Disp (S)    Mean/Disp (K)     P(95/99%) (S)   P(95/99%) (K)
0.0&0.4   5.95e-4/0.0607   0.3170/0.1205     7.86/2.13       87.85/65.57
0.1&0.0   0.0965/0.0503    -0.0342/0.0938    57.51/28.36     1.65/0.19
0.1&0.9   0.1030/0.0780    0.7634/0.1524     57.49/32.25     67.26/37.05
0.3&0.5   0.2949/0.0628    0.4179/0.1490     56.22/31.6      87.72/65.19


Table 2

Power at the 95% and 99% confidence levels for the $S_k$ statistics (notation
used in the table: 95%/99%). Results are based on 10000 Gaussian and
non-Gaussian simulations. The non-Gaussian ones were obtained from the Edgeworth
expansion for different values of skewness and kurtosis, S&K(in).

S&K(in)   S_3           S_4            S_5           S_6
0.0&0.4   8.95/2.46     74.04/53.18    66.48/36.26   71.98/23.32
0.1&0.0   44.62/20.51   33.4/12.93     23.63/5.11    16.7/0.65
0.1&0.9   49.58/31.16   100.00/99.96   99.99/99.91   100.00/99.99
0.3&0.5   99.92/99.43   99.95/99.75    99.99/99.65   99.96/98.35


Table 3

Power at the 95% and 99% confidence levels for the $W_k$ statistics (notation
used in the table: 95%/99%). Results are based on 10000 Gaussian and
non-Gaussian simulations. The non-Gaussian ones were obtained from the Edgeworth
expansion for different values of skewness and kurtosis, S&K(in).

S&K(in)   W_1           W_2             W_3             W_4
0.0&0.4   6.80/1.67     83.03/64.75     77.20/58.76     74.38/52.97
0.1&0.0   41.13/20.26   33.22/14.31     29.16/12.38     25.61/9.11
0.1&0.9   53.33/33.90   100.00/100.00   100.00/100.00   100.00/100.00
0.3&0.5   99.99/99.87   100.00/99.98    99.99/99.97     99.98/99.90


In this paper we have constructed non-Gaussian simulations where the Edgeworth
expansion has cumulants of order 5 or higher equal to zero. Cayon et al. (2003)
study non-Gaussian simulations with non-zero cumulants of order 5 and 6. The
presence of these cumulants seems to be better detected by the Shapiro-Francia
test.
Table 4

Power at the 95% and 99% confidence levels for the Shapiro-Francia statistic
(notation used in the table: 95%/99%). Results are based on 10000 Gaussian and
non-Gaussian simulations. The non-Gaussian ones were obtained from the Edgeworth
expansion for different values of skewness and kurtosis, S&K(in).

S&K(in)   SF
0.0&0.4   70.82/47.30
0.1&0.0   30.97/13.69
0.1&0.9   100.00/100.00
0.3&0.5   100.00/99.89




3 MAXIMA Data Analysis. Results 



This analysis has been carried out by Cayon et al. (2003). Below we briefly
summarise the main steps of this analysis.

Suppose we have CMB data in real space. We do not need to have the data on a
regular grid. Suppose that these data $x_i$ have mean value
$\langle x_i \rangle = 0$, and that the correlation matrix components are given
by $C_{ij} = \langle x_i x_j \rangle$, where brackets indicate the mean value
over many (ideally infinite) realizations. We perform the Cholesky decomposition
of the correlation matrix, $C = LL^t$; then $y_i = \sum_j L^{-1}_{ij} x_j$ are
uncorrelated and normalized data (zero mean, unit dispersion). Moreover, if the
CMB distribution is multinormal then the $y_i$ data are independent and Gaussian
distributed, with zero mean and unit dispersion. The tests described here
are applied to the $y_i$ data.
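The decorrelation step can be illustrated with a small NumPy sketch. The 3-pixel correlation matrix below is a hypothetical toy example, not MAXIMA data, and the function name is an assumption made for illustration.

```python
import numpy as np

def decorrelate(x, cov):
    """Return y = L^{-1} x, where cov = L L^T (L lower triangular)."""
    chol = np.linalg.cholesky(cov)           # lower-triangular factor L
    return np.linalg.solve(chol, x)          # solves L y = x without forming L^{-1}

# Toy 3-pixel example with a hypothetical correlation matrix.
cov = np.array([[1.0, 0.5, 0.2],
                [0.5, 1.0, 0.5],
                [0.2, 0.5, 1.0]])
rng = np.random.default_rng(0)
x = np.linalg.cholesky(cov) @ rng.standard_normal((3, 100000))  # correlated maps
y = decorrelate(x, cov)                      # one column per realization
```

Solving the linear system avoids forming the inverse matrix explicitly. A dedicated triangular solver such as `scipy.linalg.solve_triangular` would exploit the structure of $L$ further, at the price of an extra dependency.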

As a real application of this method we analyse the MAXIMA data (Balbi et
al., 2000; Hanany et al., 2000). The pixels on the border of the observed region
have especially high noise levels. Because of that, we have selected pixels in
the central observed area. We have selected pixels with right ascension from
226.47 to 238.24 degs and declination from 226.47 to 238.24 degs. The selected
region has 2164 pixels. The selected data are then transformed by multiplying
them by the inverse of the Cholesky matrix. Finally, we calculate the tests
introduced above.

After calculating the statistics we compare their values with the distributions
for the Gaussian case. The values of the statistics of the data and the
corresponding confidence levels are given in Cayon et al. (2003). The values of
all the statistics used in the goodness-of-fit analysis indicate that the MAXIMA
data are compatible with Gaussianity.

We have compared the values of the statistics with distributions of independent
$N(0, 1)$ data. To be sure that the decorrelation of the data does not introduce
any artifact that could make this comparison inappropriate, we also make the
comparison with simulations of MAXIMA, as explained in Cayon et al. (2003).
No significant differences are found.



3.1 Constraints on skewness and kurtosis 



We have found that the MAXIMA data are compatible with Gaussianity. Now
we want to constrain the skewness and the kurtosis of the MAXIMA data. First,
we constrain the values of the statistics $S_3$ and $S_4$. These values are
directly related to the skewness and kurtosis (see Section 2.2.1). Since the
MAXIMA data are compatible with Gaussianity and the number of data is relatively
large, $S_3 \sim \chi^2_1$ and $S_4 \sim \chi^2_2$ (note that we renormalize the
data to zero mean and unit variance, and therefore $S_k \sim \chi^2_{k-2}$ for
$k > 2$). We denote by $\tilde{S}_3$ and $\tilde{S}_4$ the values of $S_3$ and
$S_4$ such that the probability of obtaining $S_3 \leq \tilde{S}_3$ and
$S_4 \leq \tilde{S}_4$ is 99% (if Gaussianity holds). Then, the absolute value of
the skewness is equal to or smaller than $(6\tilde{S}_3/n)^{1/2}$ with a
probability of 99% (because of the relation between the skewness and $S_3$). In
a similar way, the absolute value of the kurtosis is equal to or smaller than
$(24\tilde{S}_4/n)^{1/2}$ with the same probability (this constraint is
calculated with the skewness equal to zero). Note that we are working with
the transformed independent data $y_i$ of the original correlated MAXIMA data
$x_i$. Therefore, the constraints that we have calculated are on the skewness
and kurtosis of the $y_i$ data, but we want to constrain these values for the
$x_i$ data. These two variables are related by the Cholesky matrix:
$x_i = \sum_j L_{ij} y_j$. We denote by $S_y$ and $K_y$ the skewness and
kurtosis of the $y_i$ data, and by $S_x$ and $K_x$ the skewness and kurtosis of
the $x_i$ data. Then, we have the following relations:



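$$ S_x = \frac{1}{n} \sum_i \frac{\langle x_i^3 \rangle}{\langle x_i^2 \rangle^{3/2}} = \frac{S_y}{n} \sum_i \frac{\sum_j L_{ij}^3}{\langle x_i^2 \rangle^{3/2}}, $$

$$ K_x = \frac{1}{n} \sum_i \frac{\langle x_i^4 \rangle}{\langle x_i^2 \rangle^{2}} - 3 = \frac{K_y}{n} \sum_i \frac{\sum_j L_{ij}^4}{\langle x_i^2 \rangle^{2}}, $$
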

where $n$ is the number of data and $\langle x_i^2 \rangle = \sum_j (L_{ij})^2$.
In this way, we calculate the constraints on $S_x$ and $K_x$ once the
constraints on $S_y$ and $K_y$ are set. The limits at the 99% confidence level
for the skewness and kurtosis for the MAXIMA experiment are
$|S_x| \leq 0.035$ and $|K_x| \leq 0.036$.
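The arithmetic behind these limits can be sketched in Python under explicit assumptions: that $S_3 \sim \chi^2_1$ and $S_4 \sim \chi^2_2$ as stated above, that the $y_j$ are independent with unit variance and common skewness $S_y$ and kurtosis $K_y$, and that $S_x$ and $K_x$ are pixel averages of the normalized third and fourth cumulants of $x_i = \sum_j L_{ij} y_j$. The function names are hypothetical, and the Cholesky factor of the MAXIMA correlation matrix is not available here, so only the first step is evaluated numerically.

```python
import math
from statistics import NormalDist
import numpy as np

def cumulant_bounds_y(n, prob=0.99):
    """Bounds on |S_y| and |K_y| from S_3 ~ chi2_1 and S_4 ~ chi2_2."""
    # chi2_1 quantile: square of the two-sided standard normal quantile
    s3_max = NormalDist().inv_cdf(0.5 + prob / 2.0) ** 2
    # chi2_2 quantile: closed form -2 ln(1 - prob)
    s4_max = -2.0 * math.log(1.0 - prob)
    return math.sqrt(6.0 * s3_max / n), math.sqrt(24.0 * s4_max / n)

def propagate_to_x(s_y, k_y, chol):
    """Map (S_y, K_y) to pixel-averaged (S_x, K_x) for x = L y."""
    var_x = (chol ** 2).sum(axis=1)                  # <x_i^2> = sum_j L_ij^2
    s_x = s_y * np.mean((chol ** 3).sum(axis=1) / var_x ** 1.5)
    k_x = k_y * np.mean((chol ** 4).sum(axis=1) / var_x ** 2)
    return s_x, k_x

s_y_max, k_y_max = cumulant_bounds_y(2164)   # 2164 pixels, as in the text
```

With $n = 2164$ this gives bounds of approximately 0.136 and 0.320 on the decorrelated data. Passing these through `propagate_to_x` with the actual Cholesky factor would then yield the limits on the original data, which are tighter whenever the rows of $L$ spread over several pixels.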
4 Discussion and conclusions



Three new methods to test Gaussianity are presented and tested. Non-Gaussian
simulations are constructed with the Edgeworth expansion, which describes
deviations from the Gaussian distribution through non-zero cumulants of order
higher than 2. The statistic $W_2$ developed by Thomas and Pierce (1979) seems
to be the most powerful when only cumulants of order 3 and 4 are present.

A fundamental hypothesis of the methods developed here is that the data must
be independent, but the MAXIMA data are dependent because the cosmic signal
as well as the instrumental noise are correlated. We therefore decorrelate them
with the Cholesky decomposition. In this way, under the Gaussian hypothesis,
the data are independent. This decomposition limits the number of data
with which we can work, because the number of operations of the Cholesky
decomposition and of the Cholesky matrix inversion is of order $O(n^3)$ (Press
et al., 1994). Therefore, the calculations with large $n$ are computationally
very expensive. This is a limitation of the method presented here.

We have applied the methods to MAXIMA data. These data have been found 
to be compatible with Gaussianity under these statistical tests (Cayon et 
al., 2003). Constraints on skewness and kurtosis are set to 0.035 and 0.036, 
respectively. 

In this work we have tested the univariate Gaussian function. In the context
of the Rayner and Best test it is possible to analyse the multinormal
function directly. In this new approach, a set of orthonormal functions (Rayner
and Best, 1990) on the multinormal function is constructed. This will be done in
future work and will complete the analysis of the univariate Gaussian
distribution.